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o 

(N 

in 



o 



(N 



GIUSEPPE JURMAN 1 , MICHELE FILOSI 1 ' 2 , ROBERTO VISINTAINER 13 , 
SAMANTHA RICCADONNA 1 and CESARE FURLANELLO 1 



ABSTRACT 



The number of algorithms available to reconstruct a biological network from a dataset 
of high-throughput measurements is nowadays overwhelming, but evaluating their 
performance when the gold standard is unknown is a difficult task. Here we propose 
to use a few reconstruction stability tools as a quantitative solution to this problem. 
J~3^ We introduce four indicators to quantitatively assess the stability of a reconstructed 
network in terms of variability with respect to data subsampling. In particular, we 
^ . give a measure of the mutual distances among the set of networks generated by a 
■^)- \ collection of data subsets (and from the network generated on the whole dataset) and 
^ we rank nodes and edges according to their decreasing variability within the same 
t-h ' set of networks. As a key ingredient, we employ a global/local network distance 
■ combined with a bootstrap procedure. We demonstrate the use of the indicators in 
a controlled situation on a toy dataset, and we show their application on a miRNA 
microarray dataset with paired tumoral and non-tumoral tissues extracted from a 
cohort of 241 hepatocellular carcinoma patients. 
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1 INTRODUCTION 

The problem of inferring a biological network structure starting from a set of high-throughput 
measurements (e.g. gene expression arrays) has been positively answered by a huge number of 
deeply different solutions published in literature in the last fifteen years. Nonetheless, network 
reconstruction suffers from being a underdetermined proble m, being the number of intera ctions 



highly larger than the number of independent measurements (IDe Smet and Marchall . I201CH ) : thus 
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any algorithm has to look for a compromise between accuracy and feasibility, allowing simplifi- 
cations that inevitably mine th e precision of the final o utcome, for instance including a relevant 



number of false positive link s ( Kamburov et all 120121 ). This makes the inference problem "a 



daunting task" ( Baralla et al. . 2009h . not only in terms of devising an effective algorithm, but also 



in terms of quantitatively interpreting the obtained results. In general, the reconstruc tion accuracy 



is far from being optimal in many situations with the presence of several pitfalls (IMeyer et al 



201ll ). related to both the methods and the data (IHe et all 120091 ) . with the extreme situ ation 
of many link prediction being statistically equivalent to random guesses (jPrill et all |2010| ). In 



particular, the size (and the q uality) of the availabl e data play a critical role in the i nference pro 
cess, as widely acknowledged ( Logsdon and Mezey . 20ld : 



Gillis and Pavlidis. 2011: Miller et al. 



2012) . All these consider ations support deeming network reconstruction a still unsolved problem 



(jSzederkenyi et all 1201 ll ). 

Despite the ever rising number of available algorithms, only recently efforts have been carried 
out to wards an objective compar i son of network inference met hods also highlighting current limita- 
tions ( Altav and Emmert-Streibl 2010 ; Krishnan et al. . 2007 ) and relative strengths and disadvan- 
tages ( Madhamshettiwar et al. . 20121) . Am ong those, it is worthwhile mentioning the international 
DREAM challenge (IMarbach et alll2010l ). whose key result in the last edition advocated integra- 
tion of predictions from multiple inference methods as an effective strat egy to enhance perfor- 
manc es taking advantage from the different algorithms' complementarity ( IDe Smet and Marchall . 



20101 ). Nevertheless, the algorithm uncertainty has been so far assessed only in terms of perfor- 
mance, i.e. distance of the reconstructing network from the ground truth, wherever available, 
while not much has been instead investigated with respect to the stability of the methods. This 
can be of particular interest when no gold standard is available for the given problem, and thus 
there is no chance to evaluate the algorithm's accuracy, leaving the stability as the sole rule of 
thumb for judging the reliability of the obtained network. Here we propose to tackle the issue 
by quantifying inference variability with respect to data perturbation, and, in particular, data 
subsampling. If a portion of data is randomly removed before inferring the network, the resulting 
graph is likely to be different from the one reconstructed from the whole dataset and, in general, 
different subsets of data would generate different networks. Thus, in the spirit of applying repro- 
ducibility principles to this field, one has to accept the compromise that the inferred/non inferred 
links are just an estimation, lying within a reasonable probability interval. In brief, we aim at 
proposing a set of four indicators allowing the researcher to quantitatively evaluate the reliability 
of the inferred/non- inferred links. In detail, we quantitatively assess, for a given ratio of removed 
data and for a give number of resampling, the mutual distances among all inferred networks and 
their distances to the network generated by the whole dataset, with the idea that, the smaller the 
average distance, the stabler the network. Moreover, we provide a ranked list of the stablest links 
and nodes, where the rank is induced by the variability of the link weight and the node degree 
across the generated netwo rks, the less variable being the top ranked. As a network distance we 
employ the HIM distance (IJurman et all 120121 ). which represents a good compromise between 



local (link-based) and global (structure-based) measure of network comparison. 

As a first testbed in a controlled situation the four indicators are computed on a synthetic 
dataset for different instances of a correlation network with different measures, highlighting the 
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impact of a FDR filter on the network reconstruction method. Finally, we show the use of the 
stability measures in comparing the relevance networks inferred on a miRNA micr oarray dataset 



with paired tissues extracted from a cohort of 241 hepatocellular carcinoma patients ( iBudhu et al. 
20081 ). Data have two phenotypes, related to disease (tumoral or non-tumoral tissues) and to 



patient's sex (male or female), allowing the construction of four different networks, displaying 
different levels of stability. 

Due to the relevant computational workload, all the analysis were run as R and Python scripts 
on multicore workstations and on the FBK HPC facility Kore Linux cluster. 



2 METHODS 

Before defining the four stability indicators we briefly summarize the main definitions and prop- 
erties of the HIM network distance. 

2.1 HIM Network Distance 



(llpsen and Mikhailov 



The HIM distance (IJurman et all 120121) is a metr i c for network comparison combining an edit 
distance (Hamming (Tun et all 2006 : Dougherty , 2010)) and a spe ctral one (Ipsen-Mikhailov 



20021 )). As discussed in (jJurman et all 1201 ll ). edit distances are local 



that is they focus only on the portions of the network interested by the differences in the pres- 
ence/absence of matching links. Spectral distances evaluate instead the global structure of the 
compared topologies, but they distinguish isomorphic or isospectral graphs, which can correspond 
to quite different conditions within the biological context. Their combination into the HIM dis- 
tance represents an effective solution to the quantitative evaluation of network differences. 

Let A/i and M2 be two simple networks on iV nodes, described by the corresponding adjacency 
matrices A\ and A 2 , with aff ,a$ G J 7 , where T = F 2 = {0,1} for unweighted graphs and 
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J- = [0, 1] for weighted networks. Denote then by In the identity N x N matrix In = 

V ■■■ 1 . 

by ljv the unitary N x N matrix with all entries equal to one and by 0^ the null N x N matrix 
with all entries equal to zero. Finally, denote by Sn the empty network with iV nodes and no 
links (with adjacency matrix Ojv) and by JFn the undirected full network with N nodes and all 
possible N(N — 1) links (whose adjacency matrix is ljy — In). 
The definition of the Hamming distance is the following: 



Hamming (A/"i , A2 



l<i^j<N 



A 



(2) I 
ij I 



To guarantee independence from the network dimension (number of nodes), we normalize the 
above function by the factor 7} = Hamming(£7v, F N ) = N(N — 1): 



N(N-l) 



E 



l<i^j<N 
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When A/i and A/2 are unweighted networks, i7(A/i, A/2) is just the fraction of different matching 
links (over the total number iV(iV — 1) of possible links) between the two graphs. In all cases, 
H(Afx, A/2) G [0, 1], where the lower bound is attained only for identical networks A\ = A2 and 
the upper bound 1 is reached whenever the two networks are complementary Ai+A2 = In — 5v = 



1 

1 



1 1 ••• 0, 

Among spectral distances, we consider the Ipsen -Mikhailov distance IM which has been proven 



to be the most robust in a w ide range of situations (IJurman et all 1201 ll ). Originally introduced in 



(llpsen and Mikhailovi . |2002| ) as a tool for network reconstruction from its Laplacian spectrum, the 
definition of the Ipsen-Mikhailov metric follows the dynamical interpretation of a N-nodes network 
as a N-atoms molecule connected by identical elastic strings, where the pattern of connections is 
defined by the adjacency matrix of the corresponding network. The dynamical system is described 
by the set of N differential equations 



N 

j'=i 



for i = 0, 



N-l 



We recall that the Laplacian matrix L of an undirected network is defined as the difference 
between the degree D and the adjacency A matrices L = D — A, where D is the diagonal 
matrix with vertex degrees as entries. L is positive semidefinite and singu lar (jChungi . 11997 ; 
At ay et all 120061 : ISpielmanl . 120091 : iTonjes and Blasiud . 120091 ; lAtay et al.L 120061 ) , so its eigenvalues 
are = Ao < Ai < • ■ ■ < Ajv-i- The vibrational frequencies Ui for the network model in Eq. [2] are 
given by the eigenvalues of the Laplacian matrix of the network: A, = uf, with \ = u = 0. The 
spectral density for a graph as the sum of Lorentz distributions is defined as 



N-l 



1 



i=l 



(u - Ui) 2 + 7 2 



where 7 is the common width and K is the normalization constant defined as 

1 



K 



N-l r oo 

7£ 



1=1 



POO 

so that / p(u>, 7)dw = 1. The scale parameter 7 specifies the half-width at half-maximum, which 
Jo 

is equal to half the interquartile range. Then the spectral distance e 7 between two graphs G and 
H on N nodes with densities pc(u, 7) and Ph(w, 7) can then be defined as 



ey(G,H) 



[PgO,7) - Ph(u,i)} 2 
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(a) (b) 

Figure 1: An example of HIM distance, (a) Network A (top) and Network B (bottom); (b) 
Representation of the HIM distance in the Ipsen-Mikhailov and Hamming distance space between 
networks A versus B, C and D, where C is the fully connected network and D is the empty one. 



The highest value of e 7 is reached, for each N, when evaluating the distance between En and Fn- 
Defining 7 as the (unique) solution of 

we can now define the normalized Ipsen-Mikahilov distance as 



IM(G, H) = e T (G, H) = 




so that IM(G, H) G [0,1] with upper bound attained only for (G,H) = (£n,Fn)- Finally, the 
HIM distance is defined as the product metric of the normalized Hamming distance H and the 
normalized Ipsen-Mikhailov IM distance, normalized by the factor Vz to set its upper bound to 
1: 

HIM(iVi,JV 2 ) = -^v /H ( Ar i^2) 2 + IM(A^ 1 ,A^ 2 ) 2 
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We can represent the HIM distance in the [0, 1] x [0, 1] Hamming/Ipsen-Mikhailov space, where 
a point P(x, y) represents the distance between two networks N\ and N 2 whose coordinates are 
x = H(iVi, N 2 ) and y = IM(JVi, N 2 ) and the norm of P is a/2 times the HIM distance HIM(iV 1 , N 2 ). 
The same holds for weighted networks, provided that the weights range in [0; 1]. In Fig. [T] we 
provide an example of this representation of the HIM distance between networks of four nodes. 
Roughly splitting the Hamming/Ipsen-Mikhailov space into four main zones I, II, III, IV as in Figure 
[U we can say that two networks whose distances correspond to a point in zone I are quite close 
both in terms of matching links and of structure, while those falling in the zone III are very 
different with respect to both characteristics. Networks corresponding to a point in zone II have 
many common links, but their structure is rather different, while a point in zone IV indicates two 
networks with few common links, but with similar structure. Full mathematical details about the 
HIM distance and its two components H and IM are available in (IJurman et all 120121 ). 



2.2 Stability indicators 

We introduce now the four stability indicators, for a given subset of the original data and a given 
number of replicates, producing a set of corresponding inferred networks. The first two indicators 
concern the stability of the entire network, measuring the mutual distances of the networks inferred 
from the different replicates and their distances to the network constructed on the whole dataset. 
The other two indicators concern instead the stability (and thus the reliability) of the single nodes 
and links, in terms of mutual variability of their respective degree and weight. In Fig. [2] we detail 
the mathematical formulation of the four indicators: the smaller the indicators' values, the stabler 
the indicators' targets. In particular, for all experiments on both synthetic and biological datasets 
we used n = s — 1, r = 1 [leave-one-out stability, LOO for short], and 20 different instances of 
A;-fold cross validation (discarding the test portion) for k = 2,4, 10 (denoted by k2, k4 and klO in 
what follows), and thus n = [ s( - fc ~ 1 ' > j and r = 20 A;. 



3 RESULTS 



3.1 FDR effect on correlation networks 

As a first experiment, we want to assess the different level of stability in a correlation network in- 
ferred by a set of synthetic high-throughput signals when the inference (absolute value of Pearson 



corre lation) is computed with or without False Discovery Rate control (see for instance (IJiao et al 
201ll )). A s the correlatio n measure, we use the classical (absolute) Pearson correlation of the 



WGCNA (iHorvathl . 1201 ll ) and the novel correlation measure called Maximal Information Coeffi- 
cient (MI C), componen t of th e Maximal Information-based Nonparametr ic Exploration (MINE) 
statistics feeshef et all 1201 ll : ISpeedl . l201ll : iNature Biotechnology! . l2012h . For a set of values 
n < m and an adequate number of resampling r = min{20, (™)}, compute the indicators Ij(n, r) 
for j = 1, . . . , 4 for all the used algorithms. 
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1. Given a dataset D with s samples and p features, reconstruct (with a chosen algorithm 
ALG) the network Np on the whole dataset D; denote the p nodes of No by xf, . . . , 
and its edges' weight by a^ k , for k, h = 1, . . . ,p. 

2. Choose two integers n, r with n < s and r < f *), and build a set T>{ n ,r) = {-Di> • • • -^V} 
where -Dj is a dataset built choosing n samples from D. 

3. Reconstruct, by using the same algorithm ALG, the corresponding networks on the 
subsampled data. 

4. Compute the following indicators: 

• h(n,r) = {EIM(N D ,N Di ):i = l,...,r} 

• I 2 (n, r) = {HIM(iV Di , N Dj ) : i, j = 1, . . . r, i / j} 

• h{n,r) = {a®%} for i = 1, . . . ,r and k,h = l,...,p 

• Zi(n, r) = for i = 1, . . . , r and h = l,...,p and d the degree function. 

5. For each set of values Ii compute the mean, the range (defined as the difference between 
maximum and minumum va lue) and the 95% studentized bootstrap confidence intervals 
(IDayison and Hinklevl . 119971 ) as implemented in the R package boot (jCantv and Riplevl . 



2012 ). 



6. Comparative analysis of the statistics of the four indicators will describe the 

level of confidence (stability) in the network Np, in its links and in its nodes. 



Figure 2: Definition of the four stability indicators I±, . . . , I4. 



3.1.1 Data generation 

As a synthetic benchmark for evaluating differences between Pearson and MIC correlation mea- 
sures, and to assess the impact of the FDR filter on the construction of a correlation network, 
we built a dataset S consisting of 100 measurements (samples) of 20 variables (features) /j, from 
which we constructed the corresponding correlation networks on 20 nodes. The dataset S was gen- 
erated starting from its correlation matrix Ms, which was randomly generated with the following 
three constraints: 

{0.9 for 1 < i ^ j < 5 
0.7 for6<^j<10 
0.4 for 11 < % ^ j < 16 , 

for Corr the Pearson correlation. The correlation matrix M5 is plotted in Fig. HJ clearly, the 
correlation values in the three groups defined by the above constraints represent true relations 
between the variables, while all other smaller correlation values are due to the underlying random 
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1. Let D a dataset with m samples described by q features, and let C(h,k) = \cor(xh, Xk)\ 
where Xj is the j-th feature of D across the m samples and cor is a correlation measure. 

2. Build the standard correlation network Njj using the rule a^k = C(h, k) 

3. Build the FDR controlled (at p-value p = W~ z ) correlation network M|J using the rule 



O-hk 



C(h,k) if < 1 

otherwise, 



where the set F z is defined as follows 

Ff) = {cor (ai(x h ),Ti(x k )) > C(h,k): a^n £ S m ,i = 1, . . . , max{10 z , ml}} (3) 



Figure 3: Construction of a FDR-corrected correlation network. 



generation model for Mg. 

3.1.2 Results 

Starting from the dataset S we built five correlation networks, using MIC, absolute Pearson corre- 
lation without FDR correction (WGCNA) and absolute Pearson correlation with FDR correction, 
with p-values p = 10~ 2 , 5 • 10~ 3 , 10~ 4 . The plots of the graphs for three of the networks are dis- 
played in Fig. [5j As expected, while the WGCNA networks with highest FDR correction p = 
is discarding all links as not significant apart from the edges connecting the two disjoint sets of 
nodes 1 < i < 5} and 6 < i < 11} (the strongest correlations in the matrix Ms), 
WGNCA and MIC generates two fully connected networks with a majority of weak links. Then 
we computed the four indicators Jj, . . .I4 for all the five networks described above, in the setup 
described in Sec. 12.21 Main statistics for all the indicators I\ and I2 are reported in Tab. [1] and 
displayed in Fig. [6l 

As expected, the ratio of the discarded data has a strong impact on both the indicators I\ and 
I2: in the leave-one-out case the indicators' values are close to zero regardless of the algorithm, 
while in the &-fold cross-validation case the stability is worsening for decreasing values of k, in 
terms of both mean and confidence intervals. This means that the networks inferred from a subset 
of data have larger distance both mutually and from the network reconstructed from the whole 
datasets, but also that these distances have larger variability. From the point of view of the 
different algorithms involved, the stricter the p-value in the FDR controlled WGCNA networks, 
the stabler the networks, with non controlled WGCNA and MINE as the worst performer in terms 
of stability. This is due to the fact that they are taking into account all possible correlation values, 
while most of the smaller values do not represent existing relations between variables, but they 
are rather a noise effect. As a first result then we showed that the use of a FDR control procedure 
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c\jco^i-Lncor^oocno 



CMCO^t-LOCDr^-OOCDO 

T-T-T-T-T-T-T-T-CVJ 



Figure 4: The correlation matrix Ms used to generate the synthetic dataset S 



for correlation help stabilizing the inference p r ocedu re, improving the performance of a method 
already acknowledged as effective (lAllen et all |2012| ). 

We move now on to discuss the stablest links and nodes in the three cases WGCNA, WGCNA 
FDR le-4 and MIC: in particular, in Tab. [2] and [3] we show the top-ranked links and nodes ordered 
for decreasing range over mean of their weights across all resampling k4. The results collected in 
the tables are consistent with the structure of the starting correlation matrix Ms and the behaviour 
of the inference algorithms. For the WGCNA case, the top 20 stablest links are those of the two 
fully connected subgroups Fi )5 = {fi \ 1 < i < 5} and -F 6j i {/i: 6 < i < 10} with largest Pearson 
correlation values in Ms- The same applies to WGCNA FDR le-4 (and with approximately the 
same values of weight range over weight mean as for WGCNA), for which these 20 links are the 
only existing (see Fig. \5§. Among the following ranked links in WGCNA, those belonging to the 
-^11,15 = {fi'- 11 < i < 15} group (whose correlation of about 0.3 was imposed as a constraint for 
Ms) are emerging, with a couple of exceptions, but with larger instability values (0.33-0.78 vs. 
0.03-0.14). The remaining links are the unstablest, displaying Range/Mean values always larger 
than 0.83: they are the randomly correlated links of Ms- It is interesting to note that the MIC 



Table 1: Statistics (mean, bootstrap confidence intervals and range) of the stability indicators I± 
and h for different instances of the WGCNA and MIC networks on the dataset S and for different 
values of data subsampling. 



ALG 


k 


I 


mean 


CI lower 


CI upper 


min 


max 


MIC 


klO 




0.052 


0.051 


0.052 


0.041 


0.067 


MIC 


klO 


1-2 


0.021 


0.021 


0.021 


0.014 


0.036 


MIC 


k2 




0.139 


0.134 


0.142 


0.112 


0.158 


MIC 


k2 


h 


0.047 


0.047 


0.048 


0.035 


0.067 


MIC 


k4 




0.055 


0.054 


0.057 


0.040 


0.071 


MIC 


k4 


h 


0.031 


0.031 


0.031 


0.022 


0.045 


MIC 


LOO 




0.008 


0.007 


0.008 


0.004 


0.011 


MIC 


LOO 


h 


0.008 


0.008 


0.008 


0.003 


0.014 


WGCNA 


klO 


h 


0.021 


0.020 


0.022 


0.011 


0.040 


WGCNA 


klO 


h 


0.028 


0.028 


0.028 


0.012 


0.064 


WGCNA 


k2 




0.070 


0.065 


0.076 


0.037 


0.108 


WGCNA 


k2 


I2 


0.070 


0.069 


0.071 


0.042 


0.117 


WGCNA 


k4 




0.039 


0.037 


0.041 


0.020 


0.062 


WGCNA 


k4 




0.046 


0.046 


0.047 


0.025 


0.088 


WGCNA 


LOO 




0.005 


0.005 


0.006 


0.001 


0.015 


WGCNA 


LOO 


h 


0.008 


0.008 


0.008 


0.002 


0.023 


WGCNA FDR lc-2 


klO 




0.023 


0.022 


0.025 


0.007 


0.074 


WGCNA FDR lc-2 


klO 


h 


0.028 


0.027 


0.028 


0.002 


0.102 


WGCNA FDR lc-2 


k2 




0.045 


0.039 


0.054 


0.014 


0.107 


WGCNA FDR lc-2 


k2 


h 


0.050 


0.048 


0.051 


0.006 


0.152 


WGCNA FDR lc-2 


k4 


Ii 


0.031 


0.028 


0.034 


0.010 


0.069 


WGCNA FDR lc-2 


k4 


h 


0.034 


0.034 


0.035 


0.006 


0.096 


WGCNA FDR lc-2 


LOO 


h 


0.015 


0.013 


0.016 


0.005 


0.035 


WGCNA FDR lc-2 


LOO 


h 


0.017 


0.017 


0.017 


0.001 


0.047 


WGCNA FDR 5e-3 


klO 


h 


0.025 


0.024 


0.027 


0.004 


0.054 


WGCNA FDR 5e-3 


klO 


h 


0.024 


0.024 


0.024 


0.001 


0.083 


WGCNA FDR 5e-3 


k2 


h 


0.033 


0.028 


0.038 


0.008 


0.070 


WGCNA FDR 5e-3 


k2 


I2 


0.044 


0.042 


0.045 


0.002 


0.121 


WGCNA FDR 5e-3 


k4 


h 


0.025 


0.023 


0.028 


0.006 


0.056 


WGCNA FDR 5e-3 


k4 


h 


0.032 


0.032 


0.033 


0.004 


0.099 


WGCNA FDR 5e-3 


LOO 


h 


0.029 


0.028 


0.031 


0.003 


0.048 


WGCNA FDR 5c-3 


LOO 


h 


0.018 


0.018 


0.018 


0.000 


0.054 


WGCNA FDR le-4 


klO 


h 


0.010 


0.009 


0.012 


0.000 


0.053 


WGCNA FDR le-4 


klO 


h 


0.014 


0.014 


0.015 


0.000 


0.055 


WGCNA FDR le-4 


k2 


h 


0.009 


0.007 


0.013 


0.001 


0.031 


WGCNA FDR le-4 


k2 


I2 


0.014 


0.013 


0.015 


0.001 


0.040 


WGCNA FDR le-4 


k4 


h 


0.009 


0.007 


0.012 


0.001 


0.049 


WGCNA FDR le-4 


k4 


I2 


0.014 


0.014 


0.014 


0.001 


0.054 


WGCNA FDR le-4 


LOO 


h 


0.010 


0.008 


0.013 


0.000 


0.044 


WGCNA FDR le-4 


LOO 




0.013 


0.013 


0.014 


0.000 


0.045 



network, due to the nature of the MIC statistics aimed at detecting relations between variables 
other than linear, displays similar but not identical results: the values of Range/Mean are confined 
in a narrower interval, and, although many links belonging to the F 15 and i^io groups are highly 
ranked, some of them can also be found in much lower positions of the standing. 

Similar considerations hold for the ranking of the stablest nodes: for WGCNA, the top ranking 
nodes are the F 15 and the F 610 (with similar Range/Mean values), with those in F 1115 come next, 
leaving the remaining five as the most unstable, with higher Range/Mean values. These five nodes, 
on the contrary, are the stablest for WGCNA FDR le-4: in fact, they are not wired to any other 
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Table 2: Top ranked links, ordered by weight range over weight mean across all 20 resampling of 
kA 4-fold cross validation, for the three algorithms WGCNA, WGCNAFDRle-4 and MIC 



WGCNA 


WGCNA FDR le-4 


MIC 


fi fj 


Range/Mean 


fi fj 


Range/Mean 


fi fj 


Range/Mean 


1 - 3 


0.03 


1-3 


0.03 


3 - 4 




0.20 


2 - 3 


0.04 


3 - 4 


0.04 


2 - 3 




0.20 


1 - 2 


0.04 


2 - 3 


0.04 


1 - 3 




0.21 


1 - 4 


0.04 


1 - 4 


0.05 


3 - 5 




0.22 


3 - 4 


0.04 


3 - 5 


0.05 


1 - 2 




0.23 


2 - 4 


0.04 


1-2 


0.05 


1 - 5 




0.25 


4 - 5 


0.04 


2-4 


0.05 


1 - 4 




0.26 


2 - 5 


0.05 


2 - 5 


0.06 


4 - 5 




0.27 


1 - 5 


0.05 


4-5 


0.06 


7 - 10 




0.28 


3 - 5 


0.05 


1-5 


0.06 


7 - 8 




0.29 


6 - 8 


0.08 


6 - 8 


0.08 


6 - 8 




0.29 


8 - 10 


0.10 


7 - 8 


0.09 


6 - 10 




0.30 


7 - 8 


0.11 


8 - 10 


0.10 


1 - 20 




0.31 


7 - 9 


0.11 


8-9 


0.11 


2 - 4 




0.31 


8 - 9 


0.11 


6-7 


0.11 


8 - 10 




0.31 


9 - 10 


0.11 


7-10 


0.12 


2 - 5 




0.32 


6 - 7 


0.11 


7-9 


0.12 


9 - 10 




0.32 


7 - 10 


0.12 


9 - 10 


0.13 


7 - 20 




0.33 


6 - 10 


0.13 


6 - 9 


0.13 


14 - 16 




0.33 


6 - 9 


0.14 


6 - 10 


0.15 


5 - 17 




0.35 


11 - 13 


0.33 






6 - 7 




0.35 


14 - 15 


0.41 






11 - 17 




0.36 


13 - 14 


0.46 






6 - 9 




0.36 


12 - 13 


0.58 






1 - 10 




0.37 


12 - 15 


0.60 






10 - 11 




0.37 


11 - 14 


0.62 






10 - 20 




0.37 


13 - 15 


0.71 






4 - 17 




0.37 


11 - 15 


0.78 






2 - 8 




0.37 


14 - 18 


0.78 






4 - 10 




0.37 


3 - 11 


0.83 






6 - 13 




0.37 


5 - 11 


0.83 






2 - 14 




0.37 


1 - 11 


0.84 






9 - 11 




0.38 


A 11 

4-11 


0.85 






15 - 16 




0.38 


3 - 10 


0.87 






15 - 17 




0.38 


5 - 16 


0.89 






7 - 13 




0.39 


8 - 17 


0.89 






9 - 18 




0.39 


2 - 11 


0.91 






12 - 19 




0.39 


8 - 12 


0.91 






6 - 18 




0.39 


4 - 13 


0.91 






8 - 9 




0.39 


1 - 13 


0.93 






4 - 18 




0.39 


3 - 13 


0.93 






16 - 17 




0.39 


8 - 13 


0.94 






4 - 19 




0.39 


9 - 17 


0.94 






16 - 19 




0.39 


1 - 16 


0.95 






7 - 19 




0.40 


1 - 10 


0.95 






5 - 8 




0.40 


14 - 16 


0.97 






14 - 15 




0.40 


5 - 10 


0.97 






13 - 15 




0.40 


11 - 12 


0.98 






4 - 11 




0.40 


12 - 16 


0.98 






7 - 9 




0.41 


2 - 13 


0.99 






13 - 19 




0.41 



node in any of the resampling, so their Range/Mean values are void. The nodes Fi j5 U F 6j i then 
follow in the ranking with small associated values, and the nodes ^11,15 close the standing with 
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Table 3: Top ranked nodes, ordered by degree range over degree mean across all 20 resampling of 
kA 4-fold cross validation, for the three algorithms WGCNA, WGCNA FDR le-4 and MIC. (*) 
indicates that Ratio and Mean are both zero. 



WGCNA 


WGCNA FDR le-4 


MIC 


fi Range/Mean 


fi 


Range /Mean 


fi 


Range/Mean 


4 


0.17 


16 


0* 


3 


0.08 


10 


0.18 


17 


0* 


19 


0.08 


3 


0.20 


18 


0* 


1 


0.08 


1 


0.21 


19 


0* 


4 


0.09 


9 


0.23 


20 


0* 


8 


0.09 


2 


0.23 


3 


0.03 


10 


0.09 


5 


0.24 


1 


0.04 


5 


0.10 


7 


0.24 


2 


0.04 


2 


0.10 


6 


0.24 


5 


0.05 


17 


0.10 


8 


0.25 


7 


0.07 


20 


0.10 


11 


0.40 


8 


0.07 


15 


0.11 


13 


0.40 


6 


0.09 


9 


0.11 


15 


0.43 


9 


0.09 


13 


0.11 


12 


0.45 


10 


0.09 


11 


0.11 


14 


0.48 


4 


0.13 


16 


0.11 


18 


0.55 


15 


4.42 


12 


0.11 


16 


0.60 


14 


7.05 


7 


0.11 


17 


0.68 


12 


22.82 


6 


0.12 


20 


0.70 


13 


26.05 


14 


0.13 


19 


1.15 


11 


41.83 


18 


0.13 



definitely higher values. In fact, although the nodes F n 15 have degree zero in the WGCNA FDR 
le-4 inferred from the whole S, some links involving them exist in some of the resampling on the 
subset of data. To conclude with, in the MIC case again the ranking values span a much narrower 
range than the other two cases, and the obtained dwranking has most of the nodes in F 15 in top 
positions, while for the other nodes the relation with the structure of Ms is very weak. 

Finally, the analogous tables for other ratios of the data subsampling schema (LOO, k2 and 
klO) are almost identical. 



3.2 miRNA network on a Hepatocellular Carcinoma dataset 

Investigating the relations connecting human microR NA (miRNA) and how they evolve in cancer 



has b een recently a key topic for researcher in biology (jVolinia et all 2010; Bandvopadh vav et al 



2010), with hepatocellular carcinoma (HCC) as a notable example (ILaw and Wong] . 1201 ll ; lGu et al. 
20121 ). In the following example, we use the stability indicators Ix,...,!^ on a recent miRNA 



microarray dataset with two phenotypes to highlight differences in the corresponding inferred net- 
works. As reconstr uction algorithm we use the Context Likelihood of Relatedness (CLR) approach 
(jFaith et all 120071 ). belonging to the relevance networks class of algorithms and generating undi- 
rected weighted graphs with weights bounded between zero and one. In particular, interactions are 
scored by using the mutual information between the corresponding gene expression levels coupled 
with an adaptive background correction step. Although suboptimal if the number of variables is 
much larger than the number of variables, it was observed that CLR performes well in terms of 
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prediction accuracy an d some CLR predictions in literature were later experimentally validated 
flAmbroise et all l2012t ). 



3.2.1 Data description 



We s tart out from the He patocellu l ar Ca rcinoma dataset introduced in the paper ( iBudhu et al. 
2008 ) and later used in ( Ji et all 2009 ). publicly available at the Gene Expression Omnibus 
(GEO, [http : //www . ncbi . nlm . nih . gov/geo/ ) at the accession number GSE6857. The dataset 
collects 482 tissue samples from 241 patients affected by hepatocellular carcinoma (HCC). For each 
patients, a sample from cancerous hepatic tissue and a sample from surrounding non-cancerous 
hepatic tissue are available, hybridized on the Ohio State University CCC MicroRNA Microarray 
Version 2.0 platform consisting of 11520 probes collecting expressions of 250 non-redundant human 
and 200 mo use microRNA (miRNA). After a preprocessing phase including imputation of missing 



values as in ( ITroyanskaya et all 1200 ll ) and discarding probes corresponding to non-human (mouse 



and controls) miRNA, we end up with the dataset HCC of 240+240 paired samples described 
by 210 human miRNA, with the cohort consisting of 210 male and 30 female patients. We thus 
parted the whole dataset HCC into four subsets combining the sex and disease status phenotypes, 
collecting respectively the cancer tissue for the male patients (MT), the cancer tissue for the female 
patients (FT) and the corresponding two datasets including the non cancer tissues (MnT, FnT). 



3.2.2 Results 

Using the CLR algorithm we first generated the four networks inferred from the whole sets of 
data and corresponding to the combinations of the two binary phenotypes: a portrait of the 
resulting graphs is depicted in Fig. [HI discarding links whose weight is smaller than 0.1. As a first 
observation, the four networks have a different structure, for instance the tumoral tissues graphs 
being more connected than the controls and the female graphs more than the corresponding male 
ones (see for instance the density values in Fig. |HJ). In particular, their mutual HIM distances 
are reported in Tab. [TJ together with the corresponding two-dimensional scaling plot, showing 
that the networks corresponding to the female patients (and, in particular, the one inferred from 
cancer tissue) are notably different from those arising from the subset of data for the male patients. 
We then computed the stability indicators I\ and I2 in the setup described in Sec. 12.2} and the 
corresponding statistics are collected and displayed in Tab. [Hand Fig. E0 

It is immediately evident the different sample size impact on the network stability: the networks 
corresponding to male patients have smaller values for I\ and I2 (and thus they are much stabler) 
than the corresponding female counterparts, and this effect is even stronger than the one due to the 
ratio of the chosen subsets of data: the leave-one-out stability for FT and FnT is worse than klO 
and k4 stability for MT and MnT. On the other hand, while control and cancer networks display 
similar level of stability in the male networks at all levels of subsampling ratio, in the female group 
the network associated to the controls is much stabler than the matching control networks, and 
this is evident when the size of the subset used for inference gets smaller, in particular for k = 2. 

Finally, to show how to use indicators I3 and I4 to extract information about stability of 
some interesting links, we first rank all links according to their weight Range/Mean value for all 
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Table 4: Statistics (mean, bootstrap confidence intervals and range) of the stability indicators I\ 
and I2 for the CLR inferred networks on the datasets MT, MnT, FT, FnT, for different values of 
data subsampling. 



PROBL 


k 


I 


mean 


lower 


upper 


min 


max 


FT 


klO 


h 


0.040 


0.037 


0.044 


0.002 


0.177 


FT 


klO 


h 


0.054 


0.054 


0.055 


0.000 


0.256 


FT 


k2 




0.069 


0.056 


0.082 


0.006 


0.154 


FT 


k2 


h 


0.089 


0.084 


0.093 


0.005 


0.250 


FT 


kl 




0.057 


0.049 


0.066 


0.004 


0.190 


FT 


k4 


h 


0.078 


0.076 


0.080 


0.003 


0.305 


FT 


LOO 




0.022 


0.016 


0.032 


0.002 


0.093 


FT 


LOO 


h 


0.032 


0.030 


0.035 


0.001 


0.143 


FnT 


klO 




0.032 


0.029 


0.035 


0.002 


0.093 


FnT 


klO 


h 


0.045 


0.044 


0.045 


0.000 


0.179 


FnT 


k2 




0.094 


0.071 


0.117 


0.006 


0.257 


FnT 


k2 




0.119 


0.113 


0.124 


0.006 


0.391 


FnT 


kl 




0.062 


0.054 


0.072 


0.005 


0.203 


FnT 


k4 


h 


0.080 


0.078 


0.082 


0.003 


0.307 


FnT 


LOO 


h 


0.022 


0.017 


0.027 


0.003 


0.048 


FnT 


LOO 


h 


0.030 


0.028 


0.032 


0.001 


0.094 


MT 


klO 


h 


0.011 


0.010 


0.013 


0.001 


0.048 


MT 


klO 


h 


0.016 


0.016 


0.016 


0.001 


0.092 


MT 


k2 


h 


0.040 


0.033 


0.051 


0.003 


0.146 


MT 


k2 


h 


0.051 


0.048 


0.054 


0.003 


0.218 


MT 


kl 


h 


0.024 


0.020 


0.029 


0.002 


0.099 


MT 


k4 


h 


0.033 


0.032 


0.033 


0.001 


0.148 


MT 


LOO 


h 


0.002 


0.002 


0.002 


0.000 


0.018 


MT 


LOO 


h 


0.003 


0.003 


0.003 


0.000 


0.030 


MnT 


klO 


h 


0.009 


0.008 


0.010 


0.001 


0.034 


MnT 


klO 


h 


0.013 


0.013 


0.013 


0.001 


0.061 


MnT 


k2 


h 


0.033 


0.026 


0.041 


0.003 


0.104 


MnT 


k2 


h 


0.037 


0.035 


0.039 


0.002 


0.158 


MnT 


kl 


h 


0.018 


0.015 


0.022 


0.001 


0.067 


MnT 


k4 


h 


0.025 


0.024 


0.026 


0.001 


0.102 


MnT 


LOO 


h 


0.002 


0.002 


0.002 


0.000 


0.009 


MnT 


LOO 


h 


0.003 


0.003 


0.003 


0.000 


0.016 



the four cases MT, MnT, FT, FnT, and then we point out six links worth a comment, listed 
in Tab. [5j The link (a) is top ranking in all four cases as expected, since hsa-mir_321Nol and 
hsa-mir-321No2 den ote essentially the same miRNA (identical or with very similar sequences, 



( Ambros et al. . 2003[ ) . The same applies to the links (b) and (c), but in these cases the stability 



of these two links in the FnT network is not as good as in the other three cases, probably due 
to the presence of noise in the data. The link (d) is interesting because of the difference of its 
stability between the male and the female networks, indicating a link probably associated to sex 
rather than HCC The behaviour of link (e) is even more singular: it is one of the stablest links 
for the FT network, while is not even picked up as a link by CLR in the FnT net work. Finally, 



link (f) is a very w e ll known connection i n lite rature, strongly associated to cancer (jVolinia et al. 



20101 ; iBraun et all 120081 ; I Georges et all . 120081 ) as confirmed by its high stability in the MT and 



FT networks only. 
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Table 5: Position in the weight Range/Mean ranking in the four cases MT, MnT, FT, FnT for 
six miRNA-miRNA links. 



id 


hsa-mirJdxl 


hsa-mir_idx2 


MT 


MnT 


FT 


FnT 


(a) 


321Nol 


321No2 


1 


1 


9 


2 


(b) 


016b.chr3 


16.2Nol 3 


12 


15 


309 




(c) 


021.prec.l7Nol 


21Nol 


27 


5 


2 


921 


(d) 


219.1Nol 


321No2 


2 


6 


1903 


314 


(e) 


326Nol 


342No2 


132 


1017 


3 




(f) 


192.2.3Nol 


215.precNol 


4 


300 


4 


3340 



4 CONCLUSIONS 

We introduced a suite of four stability indicators for assessing the variability of network recon- 
struction algorithm as functions of a data subsampling procedure. The aim here is to provide the 
researchers with an effective tool to compare either the inference algorithms or the investigated 
dataset. Two indicators are based on a measure of a normalized distance between networks and 
they are global, giving a confidence measure on the whole inferred dataset, while the other two 
are local, associating a reliability score to the network nodes and detected links. They are of 
particular interest when no gold standard is known for the studied task, so they can work as a 
substitute for the algorithm accuracy. We demonstrated their consistency on a synthetic dataset, 
and we showed their use on a high-throughput microarray experiment, with two widely known 
inference methods such as WGCNA and CLR. 
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Figure 5: Correlation networks inferred by the dataset S using (a) absolute Pearson, (b) absolute 
Pearson with FDR correction at p- value 10~ 4 and (c) MIC. Node label i corresponds to feature fi, 
node size is proportional to node degree and link colors identify different classes of link weights. 
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Figure 6: I\ and I2 stability indicators (mean and confidence intervals) for different instances of 
the WGCNA and MIC networks on the dataset S and for different values of data subsampling. 
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Figure 7: Mutual HIM distances for the four CLR inferred networks MT, MnT, FT, FnT recon- 
structed from the whole corresponding subsets and corresponding 2D multidimensional scaling 
plot. 
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(c) FT, d = 0.0206 (d) FnT, d = 0.0121 



Figure 8: CLR networks (and corresponding density values) inferred from the 4 subsets (a) Male 
Tumoral (MT) (b) Male not Tumoral (MnT) (c) Female Tumoral (FT) and (d) Female non Tu- 
moral (FnT) of the datasets HCC. Links are thresholded at weight 0.1, node position is fixed across 
the four networks, node dimension is proportional to the degree and edge width is proportional 
to link weight. 
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Figure 9: I\ and I2 stability indicators (mean and confidence intervals) of CLR inferred networks 
for different values of data subsampling on the four subgroups Male Tumoral (MT), Male not 
Tumoral (MnT), Female Tumoral (FT) and Female non Tumoral (FnT) of the datasets HCC. 
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